feat: decimal support #114
Conversation
@CodeRabbit review

✅ Actions performed: Review triggered.
Walkthrough

Adds Decimal support across docs, dataframe serialization, mpdecimal compatibility, line-sender APIs and error codes, protocol v3 support, rpyutils big-endian conversion, tests and mock-server updates, CI and build tweaks, and a c-questdb-client submodule update.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User as User code
    participant API as Buffer / SenderTransaction
    participant Compat as mpdecimal_compat
    participant Rpy as rpyutils (qdb_mpd_to_bigendian)
    participant Serializer as Line Sender
    participant ILP as ILP Buffer
    User->>API: row(..., columns={'a': Decimal(...)})
    API->>Compat: decimal_pyobj_to_binary(PyDecimal)
    Note right of Compat: extract mpd_t (flags, exp, digits)
    alt NaN or Inf
        Compat-->>API: special indicator (empty payload, scale=0)
    else Normal Decimal
        Compat->>Rpy: qdb_mpd_to_bigendian(limbs, len, radix, exp, sign)
        Rpy-->>Compat: (unscaled_bytes, byte_width)
        Compat-->>API: (unscaled_bytes, encoded_scale)
        API->>Serializer: format decimal field (scale, width, bytes)
        Serializer->>ILP: append decimal field to ILP buffer (protocol v1-3)
    end
    ILP-->>User: serialized ILP message
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/questdb/ingress.pyx (1)
1043-1049: Minor: fix valid-types error message (missing comma concatenates entries)
`'datetime.datetime' 'numpy.ndarray'` concatenates into one token. Apply this diff:

```diff
-            'TimestampMicros',
-            'datetime.datetime'
-            'numpy.ndarray'))
+            'TimestampMicros',
+            'datetime.datetime',
+            'numpy.ndarray'))
```

src/questdb/ingress.pyi (1)
1030-1039: Add `Decimal` to the `columns` type signature.

The `Sender.row` method's `columns` parameter is missing `Decimal` in its type union, while both `Buffer.row` (line 386) and `SenderTransaction.row` (line 207) include it. This inconsistency will cause type checking errors when users try to pass Decimal values to `Sender.row`. Apply this diff to add Decimal to the type union:

```diff
     columns: Optional[
-        Dict[str, Union[bool, int, float, str, TimestampMicros, datetime, np.ndarray]]
+        Dict[str, Union[bool, int, float, str, TimestampMicros, datetime, np.ndarray, Decimal]]
     ] = None,
```
🧹 Nitpick comments (6)
src/questdb/ingress.pyx (1)
2414-2429: Doc nit: mention protocol version 3. The property doc explains v1 and v2 only; consider adding a short note for v3 to avoid confusion.
src/questdb/dataframe.md (1)
96-106: Decimal docs read well. Clear coverage across pandas/NumPy/Arrow with examples; fits the new tests. Consider adding a note on the supported scale range (0-76) here for completeness.
Also applies to: 129-157
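As a quick illustration of the documented behavior, a minimal sketch (the table name and connection string are hypothetical; Decimal values travel in an object-dtype pandas column):

```python
from decimal import Decimal
import pandas as pd

# Decimal values in a pandas object-dtype column, as the docs describe
df = pd.DataFrame({
    "price": [Decimal("101.25"), Decimal("99.700")],  # per-value scale
    "qty": [3, 7],
})

# Hypothetical send, assuming the server negotiated protocol v3:
# with Sender.from_conf("http::addr=localhost:9000;") as sender:
#     sender.dataframe(df, table_name="trades")
```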
src/questdb/ingress.pyi (1)
709-711: Clarify null representation for Decimal columns.

The table shows `Y (NaN)` for nulls in the Decimal row. However, `NaN` is typically associated with float types. For Decimal objects, nulls are represented as `None` or `pandas.NA`, not `NaN`. Consider changing this to just `Y` or `Y (None)` for clarity. Apply this diff if you agree:

```diff
 * - ``'object'`` (``Decimal`` objects)
-  - Y (``NaN``)
+  - Y
   - ``DECIMAL``
```

src/questdb/mpdecimal_compat.h (1)
1-19: Document CPython version compatibility assumptions.

This compatibility layer relies on CPython's internal `Decimal` implementation details (struct layout and limb size). These internals may change between CPython versions. Consider:
- Adding a comment documenting which CPython versions are supported (e.g., 3.8+)
- Adding runtime checks in the Cython code to verify struct layout hasn't changed
- Noting in documentation that this is a best-effort compatibility layer
Example comment to add:
```diff
+/*
+ * Compatibility layer for CPython's decimal module (libmpdec).
+ * Tested with CPython 3.8 through 3.12.
+ * May break with future CPython versions if internal Decimal layout changes.
+ */
+
 /* Determine the limb type used by CPython's libmpdec build. */
 #if SIZE_MAX == UINT64_MAX
```

src/questdb/dataframe.pxi (2)
59-73: Add comments explaining byte-swap usage for Arrow decimals.

The `bswap32` and `bswap64` functions are used later for Arrow decimal types (lines 2226, 2245, etc.), but it's not immediately clear why byte-swapping is needed: Arrow stores decimal values in native (little-endian) byte order, while the ILP protocol expects big-endian. Add a comment explaining the endianness conversion:

```diff
+# Arrow decimal values are stored little-endian; the ILP protocol expects
+# big-endian. These helpers perform the endianness conversion.
 cdef inline uint32_t bswap32(uint32_t value):
```
2213-2295: LGTM! Arrow decimal serialization correctly handles all bit widths. All four Arrow decimal serialization functions properly:
- Check Arrow validity bitmaps
- Send NULL for invalid values
- Perform correct byte-swapping for endianness conversion
- Use the stored scale from column metadata
The 128-bit and 256-bit handlers correctly swap both byte order within each 64-bit word and reverse the word order.
Optional: Consider a helper function to reduce duplication.
The four functions have similar structure. You could extract common logic:
```cython
cdef void_int _arrow_decimal_to_bytes(
        col_t* col,
        size_t byte_count,
        uint64_t* out_buffer,
        bint* valid_out) noexcept nogil:
    """Extract and byte-swap Arrow decimal to output buffer."""
    # Common extraction and swapping logic
```

This would reduce duplication and make maintenance easier, though the current approach is also acceptable.
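To make the swap pattern concrete, a small Python sketch (the function name is illustrative, not one of the Cython helpers) showing that swapping bytes within each 64-bit word and then reversing the word order is the same as reversing all 16 bytes:

```python
import struct

def le128_to_be(raw: bytes) -> bytes:
    """Little-endian 128-bit value -> big-endian: byte-swap each
    64-bit word, then emit the high word first."""
    lo, hi = struct.unpack("<QQ", raw)   # two little-endian 64-bit words
    return struct.pack(">QQ", hi, lo)    # high word first, each big-endian

# The two-step swap is equivalent to a full byte reversal:
assert le128_to_be(bytes(range(16))) == bytes(range(16))[::-1]
```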
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- c-questdb-client (1 hunks)
- src/questdb/dataframe.md (1 hunks)
- src/questdb/dataframe.pxi (17 hunks)
- src/questdb/ingress.pyi (7 hunks)
- src/questdb/ingress.pyx (5 hunks)
- src/questdb/line_sender.pxd (3 hunks)
- src/questdb/mpdecimal_compat.h (1 hunks)
- src/questdb/mpdecimal_compat.pxd (1 hunks)
- test/test.py (4 hunks)
- test/test_dataframe.py (5 hunks)
🧰 Additional context used
🪛 Clang (14.0.6)
src/questdb/mpdecimal_compat.h
[error] 4-4: 'Python.h' file not found
(clang-diagnostic-error)
🪛 Ruff (0.14.1)
test/test_dataframe.py
90-90: Avoid specifying long messages outside the exception class
(TRY003)
93-93: Avoid specifying long messages outside the exception class
(TRY003)
95-95: Avoid specifying long messages outside the exception class
(TRY003)
97-97: Avoid specifying long messages outside the exception class
(TRY003)
102-103: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_arm64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_x86_64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel macos_x64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_pypy)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_i686)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_musllinux)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_manylinux_x86_64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel start_linux_arm64_agent_aws)
- GitHub Check: questdb.py-questdb-client (Building and testing on windows-msvc-2019)
- GitHub Check: questdb.py-questdb-client (Building and testing on mac)
- GitHub Check: questdb.py-questdb-client (Building and testing on linux-qdb-master)
- GitHub Check: questdb.py-questdb-client (Building and testing on linux-old-pandas)
- GitHub Check: questdb.py-questdb-client (Building and testing on linux)
- GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion2x)
- GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion1x)
🔇 Additional comments (29)
c-questdb-client (1)
1-1: Submodule reference update requires verification of C extension changes.

This file contains only a pointer update to the C extension submodule. The actual Decimal support implementation in the C extension (`questdb/c-questdb-client`) is not accessible for review from this context. Given that the broader PR adds significant Decimal support across the Python wrapper (dataframe serialization, ILP ingestion, type signatures), ensure that:

- The C extension at commit `5b17715...` includes corresponding Decimal serialization/deserialization logic.
- The binary protocol changes (if any) are compatible with the Python-side changes.
- The submodule commit has been tested with the tandem QuestDB core PR (questdb/questdb#6068).
Note: This PR is marked "DO NOT MERGE" and depends on upstream changes.
To verify C extension compatibility, you may want to:
- Inspect the C extension diff at the target commit to ensure Decimal support aligns with the Python wrapper changes.
- Confirm that protocol version 3 support (mentioned in the AI summary) is implemented in the C extension.
- Verify integration tests pass with the updated submodule.
test/test.py (3)
45-49: V3 pandas tests import looks good. Keeps the suite discoverable only when pandas is present.
415-425: Protocol-version validation updates are correct. Treating 3 as valid and 4/'4' as invalid with the updated error text matches the new Sender/Buffer checks.
If CI still runs the “unsupported client for V3” test, please confirm it’s updated (or gated) to reflect that the client now supports V3.
Also applies to: 430-432
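The validation the tests exercise can be sketched in plain Python (the function name and message are illustrative; the actual check lives in the Sender/Buffer implementations):

```python
SUPPORTED_PROTOCOL_VERSIONS = (1, 2, 3)

def validate_protocol_version(version):
    """Accept protocol versions 1-3; reject anything else,
    mirroring the updated tests (4 and '4' are both invalid)."""
    if version in SUPPORTED_PROTOCOL_VERSIONS:
        return version
    raise ValueError(
        f"invalid protocol version {version!r}; valid versions are 1, 2 and 3")
```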
1478-1479: Public name fix for V2. Renaming to "protocol version 2" is consistent with the class.
test/test_dataframe.py (3)
84-121: Decimal payload helpers are solid. The helpers make intent clear and align with the binary format used in assertions.
570-585: Comprehensive decimal test coverage. Covers pyobj decimals (incl. special values) and Arrow decimals across widths; version-gated appropriately.
Also applies to: 586-597, 598-608, 609-646
1705-1709: Updated error-message regex. The new wording ("Unsupported arrow type …") matches current behavior.
src/questdb/mpdecimal_compat.pxd (1)
24-71: Decimal → ILP conversion helper looks correct
- Handles NaN/Inf as nulls.
- Builds unscaled integer from mpd limbs correctly (LE limbs × MPD_RADIX).
- Enforces max scale 76 and applies sign.
One note: zero encodes to an empty mantissa (length 0), which matches the tests’ “special values” treatment; confirm your wire format also expects empty mantissa for numeric zero, or adjust to emit a single 0x00 byte.
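For reference, the overall transformation the helper performs can be sketched with the public `decimal` API alone (a simplified model, not the mpd-limb implementation; note this sketch emits a single 0x00 byte for zero, unlike the empty-mantissa behavior discussed above):

```python
from decimal import Decimal

def decimal_to_wire(value: Decimal) -> tuple[bytes, int]:
    """Convert a Decimal to (big-endian two's-complement unscaled bytes, scale)."""
    if not value.is_finite():
        # NaN/Inf are treated as nulls: empty payload, scale 0
        return b"", 0
    sign, digits, exponent = value.as_tuple()
    unscaled = int("".join(map(str, digits)))
    if exponent > 0:
        # e.g. Decimal("1E+2"): fold the positive exponent into the mantissa
        unscaled *= 10 ** exponent
        exponent = 0
    scale = -exponent
    if sign:
        unscaled = -unscaled
    # minimal signed byte width for the unscaled integer
    width = max(1, (unscaled.bit_length() + 8) // 8)
    return unscaled.to_bytes(width, "big", signed=True), scale
```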
src/questdb/ingress.pyx (1)
1241-1260: Decimal support is limited to the dataframe() path, not row(); verify PR description scope and consider scoping Decimal to dataframe only.

The review comment is accurate. After examining the codebase:

- Buffer.row()'s `columns` parameter type hint excludes Decimal (only: `bool, int, float, str, TimestampMicros, TimestampNanos, datetime.datetime, numpy.ndarray`)
- Decimal support via `decimal_pyobj_to_binary` is implemented only in dataframe.pxi
- No `_column_decimal` method exists in the Buffer class; only `_column_bool`, `_column_i64`, `_column_f64`, `_column_str`, `_column_ts_micros`, `_column_ts_nanos`, `_column_numpy`
- Decimal in dataframe requires protocol v3 (tests skip for `version < 3`)

If the PR description shows `sender.row(... Decimal(...))`, the documentation/example is inconsistent with the implementation. Either add Decimal support to row() (requiring `_column_decimal` and a protocol v3 guard) or scope examples/docs to dataframe-only.

src/questdb/line_sender.pxd (2)
43-56: LGTM! Protocol version and error code additions are well-structured. The addition of `line_sender_error_invalid_decimal` and `line_sender_protocol_version_3` follows existing conventions and provides the necessary foundation for Decimal support.
268-282: LGTM! Decimal buffer functions follow established patterns.

The two new functions `line_sender_buffer_column_dec_str` and `line_sender_buffer_column_dec` are well-designed:

- Consistent with existing column buffer functions
- Support both text (string) and binary formats
- Include proper error handling via `err_out` parameters

src/questdb/ingress.pyi (4)
43-61: LGTM! Import and error code additions are correct. The import of `Decimal` and the addition of `DecimalError` to the `IngressErrorCode` enum are necessary for type checking support.
207-207: LGTM! Type signature correctly includes Decimal. The addition of `Decimal` to the `columns` parameter type union in `SenderTransaction.row` enables proper type checking for decimal column values.
386-386: LGTM! Type signature correctly includes Decimal. The addition of `Decimal` to the `columns` parameter type union in `Buffer.row` enables proper type checking for decimal column values.
407-456: LGTM! Documentation clearly illustrates Decimal usage.

The example usage and type mapping table additions help users understand:

- How to pass Decimal values in the `columns` dict
- The mapping from Python `Decimal` to the ILP `DECIMAL` type

src/questdb/mpdecimal_compat.h (3)
21-35: Add runtime validation for struct layout assumptions.

The `mpd_t` and `PyDecObject` struct definitions assume a specific memory layout that matches CPython's internal implementation. If CPython changes these internals, this code will silently produce incorrect results or crash.
mpdecimal_compat.pxdor module init) to verify:
- Size of Python Decimal objects matches expectations
- Basic sanity checks on extracted values (e.g., comparing against
decimalmodule's official API)Example validation approach:
# At module initialization test_decimal = Decimal("123.45") # Extract using compatibility layer # Also extract using official decimal API # Assert they matchThis would catch breaking changes early rather than producing silent corruption.
37-44: LGTM! Accessor functions correctly handle inline vs heap storage.

The `decimal_digits()` function properly handles both storage modes:

- Heap-allocated: uses `dec->dec.data`
- Inline (small decimals): uses `dec->data[4]`

This matches CPython's optimization for small decimal values.
46-54: LGTM! Flag definitions match mpdecimal constants. The flag enum and `MPD_RADIX` constant definitions are correct and consistent with libmpdec's public interface.
96-110: LGTM! Enum additions for decimal target are correct. The addition of `col_target_column_decimal = 9` and updating `col_target_at = 10` maintains the enum sequence. The target name "decimal" in `_TARGET_NAMES` is consistent with other entries.
152-179: LGTM! Decimal source types cover all supported formats.

The five decimal source types provide comprehensive coverage:

- `col_source_decimal_pyobj`: Python `Decimal` objects
- `col_source_decimal32/64/128/256_arrow`: Arrow decimal types of different bit widths

The inclusion in `_PYOBJ_SOURCE_DESCR` enables clear error messages.
249-272: LGTM! Target-to-source mappings are complete. The `_TARGET_TO_SOURCES` mapping correctly includes all five decimal source types for the `col_target_column_decimal` target. The addition to `_FIELD_TARGETS` ensures decimal columns are recognized as field columns.
397-406: LGTM! Dispatch codes follow established patterns. The five dispatch codes combining `col_target_column_decimal` with each decimal source type enable efficient routing in the serialization switch statement. This follows the same pattern used for other column types.
427-432: LGTM! Scale field addition is well-documented. The `scale` field in `col_t` correctly stores the decimal scale for Arrow types. The comment clearly indicates it's only used for Arrow decimals and defaults to 0. `uint8_t` is sufficient for the 0-76 scale range.
956-979: LGTM! Arrow decimal type resolution is comprehensive.

The `_dataframe_series_resolve_arrow` function correctly:
- Handles all four Arrow decimal bit widths (32/64/128/256)
- Validates scale is within the supported 0-76 range
- Provides clear error messages with GitHub issue link
- Stores the scale for later use in serialization
1046-1047: LGTM! Decimal detection in object columns is correct. The `isinstance(<object>obj, Decimal)` check properly identifies Decimal objects in pandas object-dtype columns and sets the appropriate source type. This is consistent with other type detection logic in `_dataframe_series_sniff_pyobj`.
1163-1164: LGTM! ArrowDtype support enables Arrow-backed decimal columns. The handling of `_PANDAS.ArrowDtype` by delegating to `_dataframe_series_resolve_arrow` correctly enables support for Arrow-backed columns in pandas, including Arrow decimal types.
1300-1315: LGTM! Formatting improves readability of bitwise operations. The reformatting of the bitwise validity checks with explicit parentheses and line breaks makes the bit manipulation logic clearer without changing behavior.
2171-2210: LGTM! Python Decimal serialization handles all cases correctly. The serialization function properly handles:
- Null values (returns early)
- Special values like NaN/Inf (sends as NULL to server)
- Mantissa size validation (127-byte limit)
- Error reporting via IngressError
The use of `decimal_pyobj_to_binary` encapsulates the complex decimal-to-binary conversion logic.
2456-2465: LGTM! Dispatch switch correctly routes all decimal types.

The five decimal dispatch cases properly route to their respective serialization functions. The GIL handling is correct:

- `decimal_pyobj` doesn't pass `gs` (requires GIL)
- Arrow variants pass `gs` (can release GIL)
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/questdb/dataframe.pxi (1)
972-977: Verify the decimal scale range (0-76) is correct. The code validates that the Arrow decimal scale is between 0 and 76 inclusive. Please confirm this range matches QuestDB's decimal column constraints and is documented in the database's decimal support specifications.
If this range is database-specific, consider adding a constant or comment referencing the specification:
```diff
+    # QuestDB decimal scale range: 0-76 (see: https://questdb.io/docs/reference/sql/datatypes/#decimal)
     if arrowtype.scale < 0 or arrowtype.scale > 76:
```
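The suggested constant-plus-check could look like this in plain Python (the 0-76 bound is taken from this review and should be confirmed against QuestDB's documentation):

```python
MAX_DECIMAL_SCALE = 76  # assumed QuestDB bound; confirm against the server docs

def check_decimal_scale(scale: int) -> None:
    """Reject Arrow decimal scales outside the assumed supported range."""
    if scale < 0 or scale > MAX_DECIMAL_SCALE:
        raise ValueError(
            f"decimal scale {scale} outside supported range 0-{MAX_DECIMAL_SCALE}")
```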
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/questdb/dataframe.pxi (17 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-22T07:31:37.016Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: src/questdb/ingress.pyx:192-195
Timestamp: 2025-10-22T07:31:37.016Z
Learning: In src/questdb/ingress.pyx, the IngressErrorCode.BadDataFrame enum member is Python-internal only and does not map to any C error code from the underlying library. It is defined as `<int>line_sender_error_invalid_decimal + 1` and should not be added to the c_err_code_to_py function, which only handles C-to-Python error code translation.
Applied to files:
src/questdb/dataframe.pxi
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_x86_64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel macos_x64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_i686)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_pypy)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_manylinux_x86_64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_musllinux)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel start_linux_arm64_agent_aws)
- GitHub Check: questdb.py-questdb-client (Building and testing on windows-msvc-2019)
- GitHub Check: questdb.py-questdb-client (Building and testing on mac)
- GitHub Check: questdb.py-questdb-client (Building and testing on linux-qdb-master)
- GitHub Check: questdb.py-questdb-client (Building and testing on linux-old-pandas)
- GitHub Check: questdb.py-questdb-client (Building and testing on linux)
- GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion2x)
- GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion1x)
🔇 Additional comments (6)
src/questdb/dataframe.pxi (6)
59-73: Byte-swap utilities look correct.

The inline `bswap32` and `bswap64` functions implement standard endianness conversion. These are essential for converting Arrow's little-endian decimal representations to the big-endian format expected by the ILP protocol. Ensure the byte-swapping operations are thoroughly tested, especially the complex multi-chunk swaps used in decimal128 and decimal256 serialization (lines 2265-2267, 2287-2291).
96-97: Decimal enum and mapping additions look correct. The new `col_target_column_decimal` target, associated sources, and dispatch codes are consistently integrated into the existing enum and mapping structures. The `col_target_at` offset correctly shifted from 9 to 10.

Also applies to: 110-110, 179-179, 249-255, 271-272, 397-406
432-432: Scale field addition is appropriate. The `scale` field is used to store the Arrow decimal type's scale (set during resolution on line 978) and is unused for non-decimal columns. For Python `Decimal` objects, the scale is extracted during serialization from the object itself.
1046-1047: Decimal type detection is well-integrated. The decimal sniffing in object columns (lines 1046-1047) and ArrowDtype handling (lines 1163-1164) follow the established patterns and integrate cleanly with existing type detection logic.
Also applies to: 1163-1164
2456-2465: Decimal dispatch integration looks correct. The dispatch logic correctly routes each decimal dispatch code to its corresponding serialization function, following the established pattern for other column types.
2171-2296: Verify decimal mantissa size limit and byte-swapping correctness.

The decimal serialization implementation has well-tested code paths for Python Decimals and Arrow decimal types, with proper special value handling (NaN/Infinity produce NULL markers). However:
Undocumented 127-byte mantissa limit (line 2197): the hardcoded limit lacks any explanation. Verify this aligns with QuestDB's ILP wire format specification; it appears reasonable for ~307 decimal digits but needs confirmation it's not arbitrary or outdated.
Complex byte-swapping for multi-chunk decimals (decimal128 at lines 2265–2267, decimal256 at lines 2287–2291): The chunk reversal and byte-swap logic is correct in principle but should be validated against actual serialized output, especially endianness correctness across platforms.
No comments on pointer arithmetic: the bit-shift indexing (`offset << 1`, `offset << 2`) is correct but could benefit from a brief inline comment explaining the multi-chunk array indexing.

Confirm:
- 127-byte limit matches QuestDB protocol constraints
- Byte-swap output is verified correct via integration testing
- Logic handles edge cases (min/max values, overflow near 127-byte boundary)
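For intuition on the 127-byte figure: a signed big-endian mantissa of 127 bytes (1016 bits) holds magnitudes up to 2^1015 - 1, on the order of 10^305, consistent with the digit estimate above. A sketch of the bound check (helper names are illustrative; the 127-byte cap is the limit cited in this review, assumed to match the wire format):

```python
MAX_MANTISSA_BYTES = 127  # limit cited in the review; assumed wire-format cap

def signed_byte_width(unscaled: int) -> int:
    """Bytes needed for a big-endian two's-complement encoding."""
    return max(1, (unscaled.bit_length() + 8) // 8)

def mantissa_fits(unscaled: int) -> bool:
    """True when the unscaled integer encodes within the mantissa cap."""
    return signed_byte_width(unscaled) <= MAX_MANTISSA_BYTES
```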
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 0
🧹 Nitpick comments (1)
test/test.py (1)
437-437: Remove debug print statement. This print statement appears to be a debug artifact that should be removed before merging.
Apply this diff:
```diff
-        print("Testing bad protocol version:", version)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- test/test.py (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
test/test.py (2)
test/test_dataframe.py (1)
- TestPandasProtocolVersionV3 (1806-1808)

test/mock_server.py (2)
- HttpServer (124-211)
- port (210-211)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_arm64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_x86_64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel start_linux_arm64_agent_aws)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_i686)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_manylinux_x86_64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_musllinux)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel macos_x64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_pypy)
- GitHub Check: questdb.py-questdb-client (Building and testing on windows-msvc-2019)
- GitHub Check: questdb.py-questdb-client (Building and testing on mac)
- GitHub Check: questdb.py-questdb-client (Building and testing on linux-qdb-master)
- GitHub Check: questdb.py-questdb-client (Building and testing on linux-old-pandas)
- GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion1x)
- GitHub Check: questdb.py-questdb-client (Building and testing on linux)
- GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion2x)
🔇 Additional comments (6)
test/test.py (6)
27-27: LGTM: Import updated for expanded protocol version coverage. The import correctly reflects the expanded protocol version support (v1, v2, v3, and v4), aligning with changes in mock_server.py.
48-48: LGTM: Protocol version v3 test class import. The import of TestPandasProtocolVersionV3 is consistent with the existing pattern for v1 and v2, and is properly conditional on pandas availability.
420-421: LGTM: Bad protocol version test correctly updated. The test now correctly validates that version 4 is invalid (since only versions 1-3 are supported), and the error message accurately reflects the valid range.
Also applies to: 430-431
1152-1153: LGTM: HTTP auto protocol version test extended to v3. The test correctly validates that when the server supports v1, v2, and v3, the client auto-negotiates to the highest version (v3).
1176-1177: LGTM: Unsupported protocol version test updated. The test now correctly validates the scenario where the server only supports v4 (which the client doesn't support), ensuring proper forward-compatibility error handling.
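The negotiation behavior these two tests pin down can be sketched as follows (the names are illustrative, not the client's internals):

```python
CLIENT_SUPPORTED = {1, 2, 3}

def negotiate(server_versions):
    """Pick the highest protocol version both sides support; raise if
    the server only offers versions the client doesn't know."""
    common = CLIENT_SUPPORTED & set(server_versions)
    if not common:
        raise RuntimeError(
            f"server protocol versions {sorted(server_versions)} "
            "are unsupported by this client")
    return max(common)
```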
1479-1479: LGTM: Bug fix for incorrect test class name. The name attribute now correctly reflects that this test class is for protocol version 2, not version 1.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
test/system_test.py (1)
149-151: Use string variant for the "_str" test.

`test_auth_tls_insecure_skip_verify_str` passes a bool; use the string form to match the test's intent.

```diff
-        self._test_scenario(self.qdb_auth, 'tcps', tls_verify=False, **AUTH)
+        self._test_scenario(self.qdb_auth, 'tcps', tls_verify='unsafe_off', **AUTH)
```

src/questdb/ingress.pyx (1)
1048-1057: Fix error message list; add missing comma and include Decimal/TimestampNanos.

The current message concatenates `'datetime.datetime' 'numpy.ndarray'` and omits the new types.

```diff
-        valid = ', '.join((
-            'bool',
-            'int',
-            'float',
-            'str',
-            'TimestampMicros',
-            'datetime.datetime'
-            'numpy.ndarray'))
+        valid = ', '.join((
+            'bool',
+            'int',
+            'float',
+            'str',
+            'TimestampMicros',
+            'TimestampNanos',
+            'datetime.datetime',
+            'numpy.ndarray',
+            'Decimal'))
```
♻️ Duplicate comments (2)
test/system_test.py (1)
222-222: Fix timestamp type threshold: use `< (9, 1, 0)`, not `<= (9, 1, 0)`.

Version 9.1.0 should use TIMESTAMP_NS; the current check treats it as pre-9.1.0.

```diff
-exp_ts_type = 'TIMESTAMP' if self.qdb_plain.version <= (9, 1, 0) else 'TIMESTAMP_NS'
+exp_ts_type = 'TIMESTAMP' if self.qdb_plain.version < (9, 1, 0) else 'TIMESTAMP_NS'
```
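The off-by-one is easy to see with Python's lexicographic tuple comparison:

```python
version = (9, 1, 0)  # the boundary release itself

# `<=` wrongly classifies 9.1.0 as pre-9.1.0:
assert ('TIMESTAMP' if version <= (9, 1, 0) else 'TIMESTAMP_NS') == 'TIMESTAMP'

# `<` gives 9.1.0 the nanosecond type, as intended:
assert ('TIMESTAMP' if version < (9, 1, 0) else 'TIMESTAMP_NS') == 'TIMESTAMP_NS'
```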
152-157: Remove unnecessary GIL requirement for Arrow decimal sources.

Arrow decimal serializers run nogil except on error paths; marking them GIL-required degrades performance for whole-dataframe serialization.

```diff
-    col_source_decimal_pyobj = 701100
-    col_source_decimal32_arrow = 702100
-    col_source_decimal64_arrow = 703100
-    col_source_decimal128_arrow = 704100
-    col_source_decimal256_arrow = 705100
+    col_source_decimal_pyobj = 701100
+    col_source_decimal32_arrow = 702000
+    col_source_decimal64_arrow = 703000
+    col_source_decimal128_arrow = 704000
+    col_source_decimal256_arrow = 705000
```
🧹 Nitpick comments (1)
ci/run_tests_pipeline.yaml (1)
85-91: Temporary decimal-branch test block: ensure removal before merge. The comment says "Remove before merging decimal support PR"; please gate or remove this block prior to merging into main.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- ci/run_tests_pipeline.yaml (4 hunks)
- src/questdb/dataframe.pxi (17 hunks)
- src/questdb/ingress.pyx (8 hunks)
- src/questdb/mpdecimal_compat.pxd (1 hunks)
- test/system_test.py (5 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-22T07:31:37.016Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: src/questdb/ingress.pyx:192-195
Timestamp: 2025-10-22T07:31:37.016Z
Learning: In src/questdb/ingress.pyx, the IngressErrorCode.BadDataFrame enum member is Python-internal only and does not map to any C error code from the underlying library. It is defined as `<int>line_sender_error_invalid_decimal + 1` and should not be added to the c_err_code_to_py function, which only handles C-to-Python error code translation.
Applied to files:
- src/questdb/ingress.pyx
- src/questdb/dataframe.pxi
- src/questdb/mpdecimal_compat.pxd
🧬 Code graph analysis (1)
test/system_test.py (1)
src/questdb/ingress.pyi (7)
- Sender (821-1153)
- row (202-217)
- row (380-485)
- row (1030-1049)
- dataframe (219-230)
- dataframe (487-769)
- dataframe (1051-1099)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_arm64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel start_linux_arm64_agent_aws)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_x86_64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_i686)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel macos_x64)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_musllinux)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_pypy)
- GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_manylinux_x86_64)
- GitHub Check: questdb.py-questdb-client (Building and testing on windows-msvc-2019)
- GitHub Check: questdb.py-questdb-client (Building and testing on mac)
- GitHub Check: questdb.py-questdb-client (Building and testing on linux-qdb-master)
- GitHub Check: questdb.py-questdb-client (Building and testing on linux-old-pandas)
- GitHub Check: questdb.py-questdb-client (Building and testing on linux)
- GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion2x)
- GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion1x)
🔇 Additional comments (9)
test/system_test.py (4)
32-32: Version pin looks fine; remember to bump for release.
Update `QUESTDB_VERSION` to the first release with decimals before merging this PR.
36-36: Confirm decimal feature floor.
Ensure `FIRST_DECIMAL_RELEASE = (9, 2, 0)` matches the server release that first exposes DECIMAL. Adjust if the final release differs.
273-293: Decimal (python object) test looks good. Verify type name stability.
Expected column type `DECIMAL(18,3)` matches current behavior; double-check it won't change across target server versions.
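For reference, the scale-3 canonicalization discussed in this thread can be modelled with Python's stdlib `decimal` module. This is a sketch; `to_server_scale` is a hypothetical helper used only to illustrate the expected string values, not part of the client:

```python
from decimal import Decimal

# Hypothetical helper modelling the server-side DECIMAL(18, 3) behavior
# described in this review: values come back with exactly 3 fractional
# digits regardless of the input scale.
def to_server_scale(value: str, scale: int = 3) -> str:
    quantum = Decimal(1).scaleb(-scale)  # Decimal('0.001') for scale 3
    return str(Decimal(value).quantize(quantum))

print(to_server_scale('-99999.99'))  # scale-2 input  -> '-99999.990'
print(to_server_scale('-678'))       # integer input  -> '-678.000'
```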
294-327: Arrow decimal: expected scale vs dtype scale.
DataFrame uses Arrow dtype with scale 2, but asserts `DECIMAL(18,3)` and values with 3 fractional digits. Confirm the server's canonicalization to scale=3 is intended and stable; otherwise, consider making the expectation reflect the Arrow scale.
ci/run_tests_pipeline.yaml (1)
82-83: Env wiring for master tests looks correct.
src/questdb/ingress.pyx (2)
145-147: BadDataFrame computed offset is intentional; OK.
Design keeps Python-only code after the last C enum. No C mapping needed.
Based on learnings
830-834: Protocol v3: validation message and range check look good.
src/questdb/dataframe.pxi (2)
956-979: Decimal serialization paths look correct. Two sanity notes.
- Arrow endianness handling and sign-propagation via two’s complement look fine.
- Python `Decimal` path delegates to `decimal_pyobj_to_binary` and respects the 127-byte mantissa cap; with the mpdecimal fix (see separate comment), this avoids false rejections.
If available, add a negative Arrow decimal case in tests to exercise sign handling for 32/64/128/256.
Also applies to: 2170-2202, 2217-2300
1301-1316: Bitmask helpers: correctness looks good.
Validity and bit extraction logic are correct and match Arrow layout.
Actionable comments posted: 0
🧹 Nitpick comments (1)
test/system_test.py (1)
81-83: Document the version override mechanism.
The environment variable `TEST_QUESTDB_INTEGRATION_FORCE_MAX_VERSION` allows testing pre-release features (like decimal support) against development builds of QuestDB. This is useful for tandem PR development but should be documented.
Consider adding a comment explaining:
- The purpose of this override
- When it should be used (e.g., testing against QuestDB builds from source)
- That it's intended for development/CI, not production use
```diff
+ # Allow testing pre-release features against development QuestDB builds
+ # Set TEST_QUESTDB_INTEGRATION_FORCE_MAX_VERSION=1 to bypass version checks
  if os.environ.get('TEST_QUESTDB_INTEGRATION_FORCE_MAX_VERSION') == '1':
      cls.qdb_plain.version = (999, 999, 999)
      cls.qdb_auth.version = (999, 999, 999)
```
📜 Review details
📒 Files selected for processing (2)
- `c-questdb-client` (1 hunks)
- `test/system_test.py` (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- c-questdb-client
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
📚 Learning: 2025-11-07T15:46:43.284Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
Applied to files:
test/system_test.py
🧬 Code graph analysis (1)
test/system_test.py (1)
src/questdb/ingress.pyi (7)
`Sender` (821-1153), `row` (202-217), `row` (380-485), `row` (1030-1049), `dataframe` (219-230), `dataframe` (487-769), `dataframe` (1051-1099)
🔇 Additional comments (6)
test/system_test.py (6)
11-11: LGTM! The `decimal` import is correctly added at the module level and is necessary for the new decimal test cases.
26-26: LGTM! Setting `pyarrow = None` in the except block properly addresses the past review feedback about guarding the pyarrow import. This allows the `@skipIf(not pyarrow, ...)` decorators to function correctly when pyarrow is not installed.
222-222: LGTM! The version comparison fix correctly uses `<` instead of `<=`, ensuring that QuestDB 9.1.0 and later versions properly return `TIMESTAMP_NS` while earlier versions return `TIMESTAMP`. This addresses the past review feedback.
273-295: LGTM! The `test_decimal_py_obj` test is well-structured:
- Properly gated by the `FIRST_DECIMAL_RELEASE` version check
- Creates a table with explicit `DECIMAL(18,3)` schema
- Correctly validates both column metadata and inserted values
- Aligns with QuestDB's decimal handling
297-333: LGTM! The `test_decimal_pyarrow` test correctly validates decimal handling through pandas DataFrames with pyarrow backing:
- Properly decorated with `@skipIf` for optional dependencies
- Version-gated by `FIRST_DECIMAL_RELEASE`
- The test expects `DECIMAL(18,3)` and values with 3 decimal places even though the DataFrame uses `pyarrow.decimal128(18, 2)` (scale 2), which correctly reflects QuestDB's default behavior of storing decimals with scale 3
Based on learnings
32-36: Version availability confirmed; code is correct.
QuestDB version 9.1.1 exists and 9.2.0 was released on November 13, 2025. The version strategy in the code is sound:
- `QUESTDB_VERSION = '9.1.1'` references an available release
- `FIRST_DECIMAL_RELEASE = (9, 2, 0)` aligns with the decimal feature now being released
- Decimal tests will correctly activate when QuestDB 9.2.0+ is installed
The code requires no changes. The "DO NOT MERGE" label appears intentional, likely held pending the decimal release, which has now occurred.
Actionable comments posted: 1
📜 Review details
📒 Files selected for processing (1)
- `src/questdb/mpdecimal_compat.h` (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
🪛 Clang (14.0.6)
src/questdb/mpdecimal_compat.h
[error] 3-3: 'Python.h' file not found
(clang-diagnostic-error)
🔇 Additional comments (5)
src/questdb/mpdecimal_compat.h (5)
7-18: Platform detection logic is correct.
The conditional typedef based on `SIZE_MAX` correctly distinguishes 32-bit and 64-bit platforms, and the corresponding `MPD_RADIX_CONST` values (10^19 and 10^9) match libmpdec's implementation.
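As a sanity check on those radix values, a coefficient stored as libmpdec-style limbs can be reconstructed in plain Python. This is a simplified model under the assumption that limbs are stored least-significant first, as in `mpd_t.data`:

```python
# Simplified model (assumption) of libmpdec's limb layout: the coefficient
# is a sequence of base-MPD_RADIX limbs, least-significant limb first.
MPD_RADIX_64 = 10**19  # 64-bit MPD_RADIX_CONST from the header
MPD_RADIX_32 = 10**9   # 32-bit variant

def limbs_to_int(limbs, radix):
    value = 0
    for limb in reversed(limbs):  # fold from the most-significant limb down
        value = value * radix + limb
    return value

# 123 * 10**19 + 456 split into two 64-bit limbs:
print(limbs_to_int([456, 123], MPD_RADIX_64))  # -> 1230000000000000000456
```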
36-38: Inline accessor is correct.
The cast and field access are straightforward. The lack of null checking is acceptable for this low-level accessor, as callers are expected to validate inputs.
40-43: Correctly handles inline vs. heap-allocated decimal data.
The conditional return properly distinguishes between small decimals using the inline `data[4]` array and larger decimals requiring heap allocation via `dec->dec.data`.
45-51: Flag definitions are standard and correct.
The enum values match libmpdec's flag definitions, and `MPD_FLAG_SPECIAL_MASK` provides a convenient way to check for any special value (infinity or NaN).
53-53: LGTM! Correct initialization of the radix constant from the platform-specific macro.
Actionable comments posted: 0
🧹 Nitpick comments (1)
test/system_test.py (1)
293-328: `test_decimal_pyarrow` correctly validates Arrow-backed DECIMAL behavior.
The test combines `pyarrow` and pandas to send a decimal column with Arrow scale 2, while asserting a `DECIMAL(18,3)` column and 3-decimal string values (`-99999.990`, `-678.000`, `None`), which aligns with QuestDB's default DECIMAL(18,3) storage semantics. You might optionally set `min_rows=3` in `retry_check_table` to mirror the number of inserted rows, but functionally this is fine as-is.
Based on learnings
📜 Review details
📒 Files selected for processing (1)
- `test/system_test.py` (4 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
📚 Learning: 2025-11-07T15:46:43.284Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
Applied to files:
test/system_test.py
🧬 Code graph analysis (1)
test/system_test.py (1)
src/questdb/ingress.pyi (7)
`Sender` (821-1153), `row` (202-217), `row` (380-485), `row` (1030-1049), `dataframe` (219-230), `dataframe` (487-769), `dataframe` (1051-1099)
🔇 Additional comments (5)
test/system_test.py (5)
11-26: Imports and optional dependency handling look correct.
Using a plain `decimal` import plus guarding `pandas`/`pyarrow` in a `try`/`except` and setting `pyarrow = None` integrates cleanly with the `@skipIf(not pyarrow)`/`@skipIf(not pd)` decorators and avoids a premature ImportError when these extras are absent.
32-37: Version bump and decimal feature gate are consistent.
Updating `QUESTDB_VERSION` to `'9.2.0'` and introducing `FIRST_DECIMAL_RELEASE = (9, 2, 0)` provides an explicit guard for decimal tests and matches the PR objective to run them only once decimal support is available.
218-226: Timestamp type version check now matches TIMESTAMP_NS introduction.
The `exp_ts_type` condition using `< (9, 1, 0)` correctly treats 9.1.0 and newer servers as `TIMESTAMP_NS`, resolving the previous off-by-one behavior.
269-291: `test_decimal_py_obj` gives solid coverage for direct Decimal ingestion.
The test exercises HTTP ILP ingestion of a `DECIMAL(18,3)` column from a Python `Decimal`, verifies the column metadata, and asserts the stored value, all gated behind `FIRST_DECIMAL_RELEASE` so it won't run on older servers.
331-332: `__main__` unittest entrypoint is fine.
Keeping `unittest.main()` under the usual guard is standard and harmless for this test module.
amunra left a comment
Outstanding changes required:
- Clean-up of mpd binary buffer extraction.
- A few more decimal.Decimal tests.
Actionable comments posted: 1
♻️ Duplicate comments (1)
test/test_dataframe.py (1)
609-653: Arrow decimal tests look correct; still missing explicit boundary coverage.
The Arrow decimal tests validate all four widths, null handling, and payload layout (scale + byte_width + big-endian two's-complement mantissa) against `_decimal_binary_payload`, which matches the new Cython serializers. What's still missing, per the earlier review, is explicit coverage of values at or very near the representable limits for each decimal width (both positive and negative), to fully exercise the mpdecimal extraction and overflow behavior. Consider extending `arrow_cases` with boundary-case mantissas for each `arrow_type`.
🧹 Nitpick comments (1)
test/test_dataframe.py (1)
84-121: Decimal payload helpers match ILP layout; messages are fine for tests.
The constant and helpers `_decode_decimal_payload`, `_decimal_from_unscaled`, and `_decimal_binary_payload` correctly mirror the binary decimal layout (marker, format type, scale, width, mantissa) and are well-suited for asserting encoder behavior. Only minor note: Ruff's TRY003 warnings about long messages can be ignored here since these are test-only assertions, but if you want zero warnings you could factor the repeated messages into small helper functions.
📜 Review details
📒 Files selected for processing (2)
- `src/questdb/dataframe.pxi` (17 hunks)
- `test/test_dataframe.py` (5 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
📚 Learning: 2025-11-07T15:46:43.284Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
Applied to files:
test/test_dataframe.py, src/questdb/dataframe.pxi
📚 Learning: 2025-10-22T07:31:37.016Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: src/questdb/ingress.pyx:192-195
Timestamp: 2025-10-22T07:31:37.016Z
Learning: In src/questdb/ingress.pyx, the IngressErrorCode.BadDataFrame enum member is Python-internal only and does not map to any C error code from the underlying library. It is defined as `<int>line_sender_error_invalid_decimal + 1` and should not be added to the c_err_code_to_py function, which only handles C-to-Python error code translation.
Applied to files:
src/questdb/dataframe.pxi
🧬 Code graph analysis (1)
test/test_dataframe.py (1)
src/questdb/ingress.pyi (1)
`IngressError` (65-70)
🪛 Ruff (0.14.5)
test/test_dataframe.py
90-90: Avoid specifying long messages outside the exception class
(TRY003)
93-93: Avoid specifying long messages outside the exception class
(TRY003)
95-95: Avoid specifying long messages outside the exception class
(TRY003)
97-97: Avoid specifying long messages outside the exception class
(TRY003)
102-103: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (7)
test/test_dataframe.py (3)
570-608: Good coverage of pyobj Decimal values, normalization, and specials.
The three tests (`test_decimal_pyobj_column`, `test_decimal_pyobj_trailing_zeros_and_integer`, `test_decimal_pyobj_special_values`) give solid coverage: protocol-version gating, numeric equality from decoded (scale, mantissa), canonicalization of trailing zeros, and special values encoded as empty mantissas with scale 0. This aligns well with the new `decimal_pyobj_to_binary` path in the dataframe serializer.
1714-1714: Updated ArrowDtype error message aligns with new resolver.
The regex change to match `Unsupported arrow type int16 for column 'a'...` is consistent with `_dataframe_series_resolve_arrow`, which now reports unsupported ArrowDtype types in that form. This keeps tests and error reporting in sync.
1813-1816: Protocol v3 test class wiring is correct.
`TestPandasProtocolVersionV3` mirrors the v1/v2 classes, and the new decimal tests are correctly gated on `self.version >= 3`, so v3 coverage is enabled without affecting older protocol versions.
src/questdb/dataframe.pxi (4)
3-7: Decimal target/source wiring and `scale` field are consistent.
The new decimal target (`col_target_column_decimal = 9`), its name mapping, decimal sources (pyobj + 4 Arrow widths), `_TARGET_TO_SOURCES` entry, `_FIELD_TARGETS` inclusion, and corresponding `col_dispatch_code_t` entries are all internally consistent. `col_source_decimal_pyobj` is correctly marked as GIL-requiring (hundreds digit = 1) while Arrow decimal sources are not, matching the serialization pattern and allowing the main loop to drop the GIL when only Arrow decimals are present. The added `scale` field on `col_t` is zero-initialized via `calloc` and only populated for Arrow decimal sources, which is exactly how the serializers consume it.
Also applies to: 86-111, 115-157, 174-181, 183-260, 263-273, 287-407, 427-433
1300-1315: Minor bitmap helpers change is purely stylistic.
The tweaks in `_dataframe_arrow_get_bool` and `_dataframe_arrow_is_valid` only reformat the bitmask expression and add parentheses; semantics stay identical and are in line with the rest of the Arrow bitmap usage.
2170-2215: Pyobj Decimal serializer is correct and enforces sensible limits.
`serialize_decimal_py_obj` cleanly delegates to `decimal_pyobj_to_binary`, handles special values via a `mantissa is None` convention, enforces the 127-byte mantissa limit before calling `line_sender_buffer_column_dec`, and properly converts the returned bytes to a `const uint8_t*`. `_dataframe_serialize_cell_column_decimal__decimal_pyobj` respects the existing null semantics (`_dataframe_is_null_pyobj`) and simply skips nulls, mirroring the integer/float paths. This is a good fit with the new tests for regular, trailing-zero, and special Decimals.
2217-2280: Arrow decimal serializers and dispatch cases look correct.
The four Arrow decimal serializers:
- Use `_dataframe_arrow_is_valid` for null-bitmask handling, matching other Arrow numeric serializers.
- Read raw values from `buffers[1]` with the correct width:
  - 32-bit: single `uint32_t`,
  - 64-bit: single `uint64_t`,
  - 128-bit: two `uint64_t`s per value (offset << 1),
  - 256-bit: four `uint64_t`s per value (offset << 2).
- Apply `bswap32`/`bswap64` and reverse word order for 128/256-bit values so the mantissa is emitted in big-endian two's-complement form.
- Pass `col.scale` and the correct byte lengths (4, 8, 16, 32) to `line_sender_buffer_column_dec`.
- Only touch C buffers and `line_sender_*` functions in the fast path, taking the GIL via `_ensure_has_gil` only on error, which matches how other Arrow serializers are written and justifies leaving the decimal Arrow sources marked as “no GIL required”.
The additional dispatch cases in `_dataframe_serialize_cell` correctly route each decimal target/source combination to its respective serializer, so decimal columns fully participate in the existing switch-style dispatch without falling into the default error branch.
Also applies to: 2439-2448
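The word-reversal plus per-word byte-swap described above can be modelled in pure Python, under the assumption that Arrow stores the raw 128/256-bit value as little-endian 64-bit words, least-significant word first:

```python
import struct

# Model (assumption) of the 128/256-bit path: reverse the word order and
# emit each 64-bit word big-endian, yielding the big-endian
# two's-complement byte string the decimal field carries.
def le_words_to_bigendian(words):
    return b''.join(struct.pack('>Q', w) for w in reversed(words))

# -1 as a 128-bit two's-complement value: both words are all-ones.
assert le_words_to_bigendian([2**64 - 1, 2**64 - 1]) == b'\xff' * 16
# +1: low word 1, high word 0.
assert le_words_to_bigendian([1, 0]) == (1).to_bytes(16, 'big')
```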
Actionable comments posted: 0
🧹 Nitpick comments (1)
test/system_test.py (1)
269-291: Well-structured test for Python Decimal ingestion.
The test properly validates decimal support for Python `decimal.Decimal` objects sent via the `row()` method. Version gating, explicit table creation with `DECIMAL(18,3)`, and verification of both metadata and data are all correctly implemented.
Optionally, consider adding a test case with `NULL` decimal values to ensure completeness:

```python
sender.row(
    table_name,
    columns={'dec_col': None},  # NULL value
    at=qi.ServerTimestamp)
```

However, the current basic test is sufficient for initial coverage.
📜 Review details
📒 Files selected for processing (1)
- `test/system_test.py` (4 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
📚 Learning: 2025-11-07T15:46:43.284Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
Applied to files:
test/system_test.py
🧬 Code graph analysis (1)
test/system_test.py (1)
src/questdb/ingress.pyi (7)
`Sender` (821-1153), `row` (202-217), `row` (380-485), `row` (1030-1049), `dataframe` (219-230), `dataframe` (487-769), `dataframe` (1051-1099)
🔇 Additional comments (4)
test/system_test.py (4)
26-26: Pyarrow import guard correctly addressed.
The `pyarrow = None` fallback in the exception handler properly addresses the previous review comment about guarding the import. Tests decorated with `@unittest.skipIf(not pyarrow, ...)` will now skip gracefully when pyarrow is unavailable.
218-218: Version comparison fix correctly applied.
The change from `<=` to `<` properly addresses the previous critical review comment. Version 9.1.0 introduced `TIMESTAMP_NS`, and this comparison now correctly maps versions before 9.1.0 to `'TIMESTAMP'` and 9.1.0+ to `'TIMESTAMP_NS'`.
293-326: DataFrame decimal ingestion test looks correct.
The test properly validates pandas DataFrame ingestion with `pyarrow.decimal128(18, 2)` dtype. The expectations are correct:
- Input values with scale 2 are converted to the table's `DECIMAL(18,3)` definition
- Output values are correctly padded to 3 decimal places (`'-99999.990'`, `'-678.000'`)
This aligns with the documented QuestDB behavior where the table's scale takes precedence over the input scale. Based on learnings.
32-36: Version constants correctly updated for decimal support in QuestDB 9.2.0.
QuestDB 9.2.0 introduced native DECIMAL type support, confirming that the `QUESTDB_VERSION` bump to `'9.2.0'` and the new `FIRST_DECIMAL_RELEASE` constant are correctly set. The constant is appropriately used in decimal tests for version gating, aligning with the PR checklist objective.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/questdb/dataframe.pxi (1)
1163-1164: ArrowDtype branch still assumes pandas has `ArrowDtype`; consider guarding or constraining pandas.
This `elif isinstance(dtype, _PANDAS.ArrowDtype):` branch will still raise `AttributeError` on pandas versions that don't define `ArrowDtype`, which then gets wrapped as `IngressError(InvalidApiCall, ...)` for any dtype. Prior review already suggested either caching `ArrowDtype` via `getattr(pandas, "ArrowDtype", None)` in `_dataframe_may_import_deps()` and guarding the `isinstance` checks, or tightening the minimum pandas version to one that guarantees `ArrowDtype` exists and documenting that. The same concern still applies here.
Check which minimum pandas version in your supported range first introduced `pandas.ArrowDtype`, and confirm whether `py-questdb-client` currently enforces at least that version in `pyproject.toml` / packaging metadata. If not enforced, consider adding a guard around `_PANDAS.ArrowDtype` as previously suggested.
🧹 Nitpick comments (4)
setup.py (1)
81-81: Good addition of build dependency tracking.
Explicitly passing the `depends` parameter ensures the extension is rebuilt when the static library files change. This is a useful improvement to the build system.
The `list(extra_objects)` copy at line 81 is defensive but unnecessary since `extra_objects` is already a list and isn't modified afterward. You could simplify to:

```diff
- depends = list(extra_objects)
-
  return Extension(
      "questdb.ingress",
      ["src/questdb/ingress.pyx"],
      include_dirs=[
          "c-questdb-client/include",
          "rpyutils/include",
          np.get_include()],
      library_dirs=lib_paths,
      libraries=libraries,
      extra_compile_args=extra_compile_args,
      extra_link_args=extra_link_args,
      extra_objects=extra_objects,
-     depends=depends,
+     depends=extra_objects,
      define_macros = [
          ('NPY_NO_DEPRECATED_API', 'NPY_1_7_API_VERSION')
      ]
  )
```

Also applies to: 95-95
rpyutils/include/rpyutils.h (1)
115-124: Document `out` buffer size and `out_size` semantics more explicitly.
The new API is clear at a type level but its contract is underspecified:
- Callers aren't told how large `out` must be (the Rust impl assumes a fixed 32-byte buffer).
- It's also not stated whether `*out_size` is purely an output (number of bytes written) or has any in/out meaning.
Consider extending the comment to state that:
- `out` must point to a buffer of at least 32 bytes, and
- on success `*out_size` is set to the number of valid bytes at the end of that buffer.
This will make the FFI safer for external consumers without changing behavior.
test/test_dataframe.py (2)
11-12: Decimal helpers and pyobj decimal tests look correct and give strong coverage.
- The helpers (`DECIMAL_BINARY_FORMAT_TYPE`, `_decode_decimal_payload`, `_unwrap_decimal`, `_decimal_from_unscaled`, `_decimal_binary_payload`) match the documented binary layout and reconstruct values as expected, including sign and arbitrary precision.
- `test_decimal_pyobj_column` exercises a wide range of values (including extremes across 64/128/256-bit ranges and both signs) and directly compares the on-wire (scale, unscaled) with the `Decimal` decomposition, which should shake out limb/sign/overflow issues.
- Trailing-zero and special-value tests nicely pin down normalization behavior and the rejection of NaN/±Infinity.
Two small, non-blocking nits you may consider:
- In `test_decimal_pyobj_column`, narrowing the `except Exception` to `qi.IngressError` (or re-raising unexpected exceptions) would address the BLE001 warning and make failures slightly more precise.
- For `_unwrap_decimal`, you could simplify the scale logic a bit (e.g. compute `unscaled` once and derive `scale` directly from `exponent`) but the current form is perfectly readable and correct.
Overall, this is high-value coverage for the new decimal path.
Also applies to: 84-134, 583-637
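For context, the (scale, unscaled) decomposition discussed above can be expressed with `Decimal.as_tuple()` roughly like this (a sketch, not the test's exact helper):

```python
from decimal import Decimal

# Sketch of a (scale, unscaled) decomposition such that
# value == unscaled * 10**-scale.
def unwrap_decimal(d: Decimal):
    sign, digits, exponent = d.as_tuple()
    unscaled = int(''.join(map(str, digits)))
    if sign:
        unscaled = -unscaled
    return -exponent, unscaled  # scale is the negated exponent

assert unwrap_decimal(Decimal('-123.450')) == (3, -123450)
assert unwrap_decimal(Decimal('7')) == (0, 7)
```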
639-683: Arrow decimal tests correctly validate binary payloads; minor robustness tweak possible.
The Arrow decimal cases validate:
- All four width classes (32/64/128/256) using `arrow_type.scale` and `arrow_type.byte_width`.
- Exact equality of the on-wire payload to `_decimal_binary_payload(unscaled, scale, byte_width)`.
- That rows with null/invalid decimals omit the decimal field entirely (only the `count` field remains).
One minor edge case, only relevant if you later add more than one `None` in `unscaled_values`: in the `unscaled is None` branch you `continue` without advancing `offset`, so a subsequent iteration would still search from the previous row's position. Consider updating `offset` there as well (e.g. by searching for the bare `b' count={count}i\n'` suffix) if you extend these tests in future.
📜 Review details
⛔ Files ignored due to path filters (1)
`rpyutils/Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (10)
- `rpyutils/Cargo.toml` (1 hunks)
- `rpyutils/include/rpyutils.h` (1 hunks)
- `rpyutils/src/lib.rs` (1 hunks)
- `rpyutils/src/mpd_to_bigendian.rs` (1 hunks)
- `setup.py` (2 hunks)
- `src/questdb/dataframe.pxi` (17 hunks)
- `src/questdb/line_sender.pxd` (3 hunks)
- `src/questdb/mpdecimal_compat.pxd` (1 hunks)
- `src/questdb/rpyutils.pxd` (1 hunks)
- `test/test_dataframe.py` (5 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
📚 Learning: 2025-10-22T07:31:37.016Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: src/questdb/ingress.pyx:192-195
Timestamp: 2025-10-22T07:31:37.016Z
Learning: In src/questdb/ingress.pyx, the IngressErrorCode.BadDataFrame enum member is Python-internal only and does not map to any C error code from the underlying library. It is defined as `<int>line_sender_error_invalid_decimal + 1` and should not be added to the c_err_code_to_py function, which only handles C-to-Python error code translation.
Applied to files:
src/questdb/mpdecimal_compat.pxd, src/questdb/dataframe.pxi, src/questdb/line_sender.pxd
📚 Learning: 2025-11-07T15:46:43.284Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
Applied to files:
src/questdb/dataframe.pxi, test/test_dataframe.py
🧬 Code graph analysis (4)
rpyutils/include/rpyutils.h (1)
rpyutils/src/mpd_to_bigendian.rs (1)
`qdb_mpd_to_bigendian` (26-48)
rpyutils/src/lib.rs (1)
rpyutils/src/mpd_to_bigendian.rs (1)
`mpd_to_bigendian` (91-109)
rpyutils/src/mpd_to_bigendian.rs (1)
proj.py (1)
`test` (89-95)
test/test_dataframe.py (1)
src/questdb/ingress.pyi (1)
`IngressError` (65-70)
🪛 Ruff (0.14.5)
test/test_dataframe.py
90-90: Avoid specifying long messages outside the exception class
(TRY003)
93-93: Avoid specifying long messages outside the exception class
(TRY003)
95-95: Avoid specifying long messages outside the exception class
(TRY003)
97-97: Avoid specifying long messages outside the exception class
(TRY003)
102-103: Avoid specifying long messages outside the exception class
(TRY003)
614-614: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (7)
rpyutils/src/lib.rs (1)
25-27: Module wiring for `mpd_to_bigendian` looks correct.
New module is declared consistently with existing ones and will compile the new FFI implementation into the static library.
rpyutils/Cargo.toml (1)
11-15: Adding `i256` as a runtime dependency is reasonable.
Using `i256` as a dedicated 256-bit integer type matches the new decimal/limb conversion logic; the manifest change is minimal and scoped.
If you haven't already, please confirm this exact `i256` version matches the one used/validated in the main QuestDB repo and that no extra features are required from the crate.
62-72: qdb_mpd_to_bigendian Cython declaration matches the C header.
The signature (const pointers, `uint32_t exp`, `bint` for `bool`, `size_t *out_size`) is consistent with `rpyutils.h`, so the binding surface looks correct.
31-46: New decimal error/enum values and buffer APIs are consistent with existing patterns.
Appending `line_sender_error_invalid_decimal`, introducing `line_sender_protocol_version_3`, and adding the two decimal column functions all follow the existing FFI style: same `bint`/`err_out` pattern, and enum values are extended rather than reordered, so existing codes stay stable.
- These declarations exactly match
questdb/ingress/line_sender.h(names, parameter types, andscalerange), and- The new
line_sender_error_invalid_decimalis correctly mapped to the new Python‐sideIngressErrorCodewithout touching the Python‑onlyBadDataFramecode path (per the ingress.pyx learning).Also applies to: 53-57, 268-283
test/test_dataframe.py (1)
1744-1745: Updated error message expectation and protocol v3 test wiring are consistent
- The new regex `"Unsupported arrow type int16 for column 'a'.*github"` matches the more descriptive error text and keeps the test aligned with the current user‑facing message.
- `TestPandasProtocolVersionV3` cleanly reuses the existing `TestPandasBase.TestPandas` mixin, so all DataFrame tests (including the new decimal ones) now run under protocol version 3 as intended.
No further changes needed here.
Also applies to: 1843-1846
rpyutils/src/mpd_to_bigendian.rs (1)
26-48: Core mpdecimal→i256 conversion and FFI wrapper look correct and well‑tested. The limbs reduction, sign handling, 10^exp scaling, trimming, and FFI marshalling all line up with two’s‑complement semantics, and the tests cover the important boundaries (0, ±2²⁵⁵, overflow, varying radix/exp, and FFI error paths). I don’t see correctness or safety issues here; this module is good to go.
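The limbs reduction summarized above can be modeled in a few lines of stdlib Python. This is a rough sketch of the idea, not the Rust implementation: the least-significant-first limb order, parameter names, and the fixed 32-byte (256-bit) output width are assumptions made for illustration.

```python
def mpd_limbs_to_twos_complement(limbs, radix_pow, exp, negative, width=32):
    """Combine base-10**radix_pow limbs (assumed least-significant first)
    into one integer, scale by 10**exp, apply the sign, and emit
    big-endian two's-complement bytes."""
    value = 0
    for limb in reversed(limbs):          # most-significant limb first
        value = value * 10**radix_pow + limb
    value *= 10**exp                      # the exponent scales the coefficient
    if negative:
        value = -value
    # to_bytes(signed=True) raises OverflowError if the value doesn't fit
    # in `width` bytes, mirroring the FFI overflow error path.
    return value.to_bytes(width, byteorder='big', signed=True)
```

For example, limbs `[345, 12]` in base 10³ reduce to `12 * 1000 + 345 = 12345` before scaling and sign handling.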
Also applies to: 50-109, 111-418
src/questdb/dataframe.pxi (1)
59-74: Decimal dataframe plumbing (targets, sources, dispatch, and serializers) looks coherent. The new decimal target/source enums, `_TARGET_TO_SOURCES`/`_FIELD_TARGETS` mappings, `col_dispatch_code_t` variants, and `col_t.scale` field all line up, and the serializers use them consistently:
- Pyobj Decimal columns are detected in `_dataframe_series_sniff_pyobj` and serialized via `serialize_decimal_py_obj`, which defers to `decimal_pyobj_to_binary` and then `line_sender_buffer_column_dec`.
- Arrow Decimal32/64/128/256 columns are resolved through `_dataframe_series_resolve_arrow`, which validates `arrowtype.scale` in `[0, 76]`, stores it in `col.scale`, and sets the appropriate `col_source_decimal*_arrow`.
- The Arrow decimal serializers correctly respect the validity bitmap, byte‑swap from Arrow’s little‑endian storage to big‑endian, reorder 128/256‑bit limbs, and pass the right widths (4/8/16/32 bytes) plus `col.scale` into `line_sender_buffer_column_dec`.
- Decimal Arrow sources are marked as non‑GIL (702000–705000) while the pyobj source remains GIL‑bound (701100), so the main loop can still run without the GIL when only Arrow decimals are present.
Given the tests in this PR around dataframe/Decimal ingestion, this wiring looks internally consistent and in line with the intended ILP Decimal encoding.
Also applies to: 86-112, 115-157, 173-181, 183-260, 262-273, 287-407, 427-433, 956-979, 1046-1047, 2170-2202, 2204-2265, 2426-2435
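The byte-swap step described above can be illustrated with a small stdlib-only sketch. The function name is made up for illustration; what it shows is only the documented Arrow layout (little-endian two's-complement integers) and the `scaleb` reconstruction the PR's tests rely on.

```python
from decimal import Decimal

def decode_arrow_decimal(le_bytes: bytes, scale: int) -> Decimal:
    """Arrow stores decimal32/64/128/256 values as little-endian
    two's-complement integers; reversing the bytes yields the big-endian
    form the ILP encoding wants, and scaleb() re-applies the column's
    scale to recover the logical value."""
    be_bytes = le_bytes[::-1]                       # little-endian -> big-endian
    unscaled = int.from_bytes(be_bytes, byteorder='big', signed=True)
    return Decimal(unscaled).scaleb(-scale)

# e.g. a decimal128 value of 123.45 at scale 2 is stored as the
# unscaled integer 12345 in 16 little-endian bytes:
raw = (12345).to_bytes(16, byteorder='little', signed=True)
```

The same reversal works for all four widths because each value is one contiguous two's-complement integer, regardless of how the C side chunks it into limbs.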
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 0
🧹 Nitpick comments (2)
test/test_dataframe.py (2)
583-616: Consider catching specific exception types. The test uses a bare `except Exception`, which can mask unexpected errors. While acceptable in test code for diagnostics, it's better to catch specific exception types like `qi.IngressError` and `AssertionError`. Apply this diff to improve exception handling:

```diff
         try:
             buf = _dataframe(self.version, df, table_name='tbl', at=qi.ServerTimestamp)
             (scale, mantissa) = _decode_decimal_payload(buf.splitlines()[0])
             unscaled = int.from_bytes(mantissa, byteorder='big', signed=True)
             (expected_scale, expected_unscaled) = _unwrap_decimal(decimal)
             self.assertEqual(scale, expected_scale)
             self.assertEqual(unscaled, expected_unscaled)
-        except Exception as ex:
+        except (qi.IngressError, AssertionError) as ex:
             self.fail(f'Failed to serialize {decimal}: {ex}')
```
try: buf = _dataframe(self.version, df, table_name='tbl', at=qi.ServerTimestamp) (scale, mantissa) = _decode_decimal_payload(buf.splitlines()[0]) unscaled = int.from_bytes(mantissa, byteorder='big', signed=True) (expected_scale, expected_unscaled) = _unwrap_decimal(decimal) self.assertEqual(scale, expected_scale) self.assertEqual(unscaled, expected_unscaled) - except Exception as ex: + except (qi.IngressError, AssertionError) as ex: self.fail(f'Failed to serialize {decimal}: {ex}')
629-638: Consider using `assertRaises` for clearer test intent. The test for special values uses a try/except pattern that could mask unexpected errors. Using `assertRaises` would make the test intent clearer and more robust. Apply this diff:

```diff
     def test_decimal_pyobj_special_values(self):
         if self.version < 3:
             self.skipTest('decimal datatype requires ILP version 3 or later')
         df = pd.DataFrame({'dec': [Decimal('NaN'), Decimal('Infinity'), Decimal('-Infinity')]})
-        try:
-            _dataframe(self.version, df, table_name='tbl', at=qi.ServerTimestamp)
-            self.fail("special values shouldn't be encoded")
-        except qi.IngressError:
-            pass
+        with self.assertRaises(qi.IngressError):
+            _dataframe(self.version, df, table_name='tbl', at=qi.ServerTimestamp)
```
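For context, the special-value check this test exercises can be reproduced with the stdlib alone. `ValueError` stands in for `qi.IngressError` here, since this sketch does not import the client; the function name is hypothetical.

```python
from decimal import Decimal

def ensure_encodable(value: Decimal) -> None:
    """NaN and +/-Infinity have no ILP decimal representation,
    so they must be rejected before encoding."""
    if value.is_nan() or value.is_infinite():
        raise ValueError(f'cannot encode special Decimal value: {value}')

ensure_encodable(Decimal('123.45'))  # plain values pass through
for special in ('NaN', 'Infinity', '-Infinity'):
    try:
        ensure_encodable(Decimal(special))
    except ValueError:
        pass  # expected for all three special values
```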
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
test/test_dataframe.py(5 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
📚 Learning: 2025-11-07T15:46:43.284Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
Applied to files:
test/test_dataframe.py
🧬 Code graph analysis (1)
test/test_dataframe.py (1)
src/questdb/ingress.pyi (1)
IngressError(65-70)
🪛 Ruff (0.14.5)
test/test_dataframe.py
90-90: Avoid specifying long messages outside the exception class
(TRY003)
93-93: Avoid specifying long messages outside the exception class
(TRY003)
95-95: Avoid specifying long messages outside the exception class
(TRY003)
97-97: Avoid specifying long messages outside the exception class
(TRY003)
102-103: Avoid specifying long messages outside the exception class
(TRY003)
614-614: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (5)
test/test_dataframe.py (5)
11-11: LGTM! The Decimal import is necessary for the new decimal support functionality.
84-133: LGTM! The helper functions correctly implement decimal encoding/decoding logic:
- `_decode_decimal_payload` properly parses and validates the binary format
- `_unwrap_decimal` correctly converts a Decimal to its unscaled/scale representation
- `_decimal_binary_payload` generates the proper big-endian binary format
- `_decimal_from_unscaled` reconstructs Decimals using the correct `scaleb` method
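The round trip those helpers implement can be sketched with the stdlib alone. The function names below are illustrative (the real helpers live in the PR's test file); the only assumption is the `(scale, unscaled)` contract where `value == unscaled * 10**-scale`.

```python
from decimal import Decimal

def unwrap_decimal(value: Decimal):
    """Split a Decimal into (scale, unscaled) such that
    value == unscaled * 10**-scale."""
    sign, digits, exponent = value.as_tuple()
    unscaled = int(''.join(map(str, digits)) or '0')
    if exponent > 0:          # fold a positive exponent into the coefficient
        unscaled *= 10 ** exponent
        exponent = 0
    if sign:
        unscaled = -unscaled
    return -exponent, unscaled

def decimal_from_unscaled(unscaled: int, scale: int) -> Decimal:
    """Inverse operation: re-apply the scale via scaleb()."""
    return Decimal(unscaled).scaleb(-scale)
```

For instance, `Decimal('123.45')` unwraps to scale 2 and unscaled value 12345, and `decimal_from_unscaled(12345, 2)` rebuilds the original value.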
659-703: LGTM! This test comprehensively covers Arrow decimal types:
- All four decimal types (decimal32, decimal64, decimal128, decimal256) with appropriate scales
- Null value handling
- Binary payload validation
- Protocol version compatibility
1764-1764: LGTM! The updated error message pattern is more descriptive, specifying both the arrow type and column name, which improves clarity for users.
1863-1866: LGTM! The new test class follows the established pattern for protocol version testing and will properly exercise all decimal-related tests with protocol version 3.
This PR adds decimal support.
This is a tandem PR for:
Usage
Decimal object
Progress
Summary by CodeRabbit
New Features
Documentation
Bug Fixes
Tests
Chores